Global optimum protein threading with gapped alignment and empirical pair score functions.

Authors

  • R H Lathrop
  • T F Smith
Abstract

We describe a branch-and-bound search algorithm for finding the exact global optimum gapped sequence-structure alignment ("threading") between a protein sequence and a protein core or structural model, using an arbitrary amino acid pair score function (e.g. contact potentials, knowledge-based potentials, potentials of mean force). The search method imposes minimal conditions on how structural environments are defined or the form of the score function, and allows arbitrary sequence-specific functions for scoring loops and active site residues. Consequently the search method can be used with many different score functions and threading methodologies; this paper illustrates five from the literature. On a desktop workstation running LISP, we have found the global optimum protein sequence-structure alignment in NP-hard search spaces as large as 9.6 × 10^31, at rates ranging as high as 6.8 × 10^28 equivalent threadings per second (most of which are pruned before they ever are examined explicitly). Continuing the procedure past the global optimum enumerates successive candidate threadings in monotonically increasing score order. We give efficient algorithms for search space size, uniform random sampling, segment placement probabilities, mean, standard deviation and partition function. The method should prove useful for structure prediction, as well as for critical evaluation of new pair score functions.
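The abstract mentions efficient algorithms for search space size and uniform random sampling without giving details. As a rough illustration only, and not the authors' method, the Python sketch below counts and samples gapped threadings under simplifying assumptions that are ours: the core is an ordered list of fixed-length segments, each loop between consecutive segments has a length between `min_loop` and `max_loop`, and the N- and C-terminal tails are unconstrained. All function and parameter names are hypothetical.

```python
import random
from functools import lru_cache
from math import comb


def make_counter(n, seg_lens, min_loop=0, max_loop=None):
    """Return ways(i, start): number of ways to place segments i..m-1
    when segment i starts at 0-based sequence index `start`.
    Assumed model: fixed-length ordered segments, bounded loop lengths."""
    m = len(seg_lens)
    if max_loop is None:
        max_loop = n  # effectively unbounded loops

    @lru_cache(maxsize=None)
    def ways(i, start):
        end = start + seg_lens[i]      # first index after segment i
        if end > n:
            return 0                   # segment would run off the sequence
        if i == m - 1:
            return 1                   # last segment fits; the tail is free
        lo = end + min_loop
        hi = min(end + max_loop, n)
        return sum(ways(i + 1, s) for s in range(lo, hi + 1))

    return ways


def count_threadings(n, seg_lens, min_loop=0, max_loop=None):
    """Search space size: sum over all start positions of the first segment."""
    ways = make_counter(n, seg_lens, min_loop, max_loop)
    return sum(ways(0, s) for s in range(n))


def closed_form_count(n, seg_lens, min_loop=0):
    """Stars-and-bars count for the special case of unbounded loop lengths."""
    m = len(seg_lens)
    slack = n - sum(seg_lens) - min_loop * (m - 1)
    return comb(slack + m, m) if slack >= 0 else 0


def sample_threading(n, seg_lens, min_loop=0, max_loop=None, rng=random):
    """Draw one threading uniformly at random: choose each segment start
    with probability proportional to the number of completions it allows."""
    ways = make_counter(n, seg_lens, min_loop, max_loop)
    loop_cap = n if max_loop is None else max_loop
    starts = []
    candidates = range(n)
    for i in range(len(seg_lens)):
        weighted = [(s, ways(i, s)) for s in candidates]
        total = sum(w for _, w in weighted)
        if total == 0:
            raise ValueError("no valid threading exists")
        r = rng.randrange(total)       # exact integer arithmetic, no floats
        for s, w in weighted:
            if r < w:
                break
            r -= w
        starts.append(s)
        end = s + seg_lens[i]
        candidates = range(end + min_loop, min(end + loop_cap, n) + 1)
    return starts
```

For example, `count_threadings(10, [2, 3])` and `closed_form_count(10, [2, 3])` both give 21 under these assumptions. The sampler uses the same counts as weights, so the selection probabilities telescope to one over the total count; this is the standard counting-then-sampling pattern and may differ from the procedure in the paper.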

Related resources

An Anytime Local-to-Global Optimization Algorithm for Protein Threading in O(m^2 n^2) Space

This paper describes a novel anytime branch-and-bound or best-first threading search algorithm for gapped block protein sequence-structure alignment with general sequence residue pair interactions. The new algorithm (1) returns a good approximate answer quickly, (2) iteratively improves that answer to the global optimum if allowed more time, (3) eventually produces a proof that the final answer...

Failures of inverse folding and threading with gapped alignment.

To calculate the tertiary structure of a protein from its amino acid sequence, the thermodynamic approach requires a potential function of sequence and conformation that has its global minimum at the native conformation for many different proteins. Here we study the behavior of such functions for the simplest model system that still has some of the features of the protein folding problem, namel...

Confidence Measures for Fold Recognition

It is a standard procedure to compare new amino acid sequences to databases of proteins that have been studied already in order to find similarities in structure and function. This comparison can be sequence–sequence or sequence–structure based. In order to compare, an alignment is performed of the target protein sequence (whose structure we are searching) with a template protein (whose struct...

Defrosting the frozen approximation: PROSPECTOR--a new approach to threading.

PROSPECTOR (PROtein Structure Predictor Employing Combined Threading to Optimize Results) is a new threading approach that uses sequence profiles to generate an initial probe-template alignment and then uses this "partly thawed" alignment in the evaluation of pair interactions. Two types of sequence profiles are used: the close set, composed of sequences in which sequence identity lies between ...

A Bayes Optimal Probability Theory That Unifies Protein Sequence-Structure Recognition and Alignment

A rigorous Bayesian analysis is presented that unifies protein sequence-structure alignment and recognition. Given a sequence, explicit formulae are derived to select its globally most probable core structure from a structure library, its globally most probable alignment to a given core structure, its most probable joint core structure and alignment chosen globally across the entire library, and its...

Journal:
  • Journal of Molecular Biology

Volume 255, Issue 4

Pages -

Publication date: 1996